Mapping Greater Sage Grouse (Centrocercus urophasianus) occcurrence and habitat on BLM land in Wyoming¶

The Greater sage-grouse (Centrocercus urophasianus) is a threatened species endemic to the Sagebrush steppe ecosystem of Western North America (Prochazka et al., 2025). The Bureau of Land Management (BLM) manages the largest share of Greater sage-grouse habitat in the United States, a total of 65 million acres (BLM, 2024). I investigated Greater sage-grouse occurrence patterns across BLM land using Global Biodiversity Information Facility (GBIF) data to assess if high occurrence was associated with high-priority BLM habitat.

Credit¶

Original GBIF code workflow credit¶

The basis for this code comes from and Earth Lab / ESIIL Environmental Data Science Assignment, linked here.

Use of AI disclaimer¶

I used Claude Sonnet 4.5 to troubleshoot some of this workflow. It was most helpful in helping clean-up legend scale bars and add clean Natural Earth features into my maps.

Code Workflow Table of Contents¶

Step 0: Define directories and load libraries¶

Step 1: Download GBIF occurrence data¶

  • 1.1: Link GBIF to your Jupyter notebook
  • 1.2: Check species name before downloading
  • 1.3: Download occurrence data
  • 1.4: Convert to geodataframe

Step 2: Load in BLM Habitat Management Areas (HMAs)¶

  • 2.1: Load data and calculate HMA shape area (m)
  • 2.2: BLM region-specific habitat types
  • 2.3: Plot by region name
  • 2.4: Plot by priority

Step 3: Spatially join BLM HMAs and GBIF observations¶

Step 4: Normalize observation data¶

  • 4.1: Density of observations / area of HMA
  • 4.2: Remove rare observations
  • 4.3: Mean number of occurrences per HMA
  • 4.4: Mean number of occurrences per year
  • 4.5: Normalize by areal density, mean occurrences per year, and mean occurrences per HMA

Step 5: Map GRSG occurrence across HMAs¶

  • 5.1: Join HMAs (geodataframe) to normalized occurrence data
  • 5.2: Plot normalized occurrences by HMA and year
  • 5.3: Compare with raw occurrence data (GBIF)
  • 5.4: Log transform raw occurrence data to smooth hotspots

Step 0.1: Install dependencies and load libraries¶

Need to install before proceeding:

pip install pygbif

In [1]:
import pathlib
import os

import cartopy.crs as ccrs
import cartopy.feature as cfeature # static plot
import colorsys
import hvplot.pandas
import geopandas as gpd
from getpass import getpass
from glob import glob
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors # static plot
from matplotlib.patches import Patch # good legends
import numpy as np
import pandas as pd
import pygbif.occurrences as occ
import pygbif.species as species
import time
import zipfile
In [2]:
### Overall data directory
grsg_dir = os.path.join(

    ### home directory
    pathlib.Path.home(),

    ### data directory within migration git repository
    'Documents',
    'Earth Data Cert',
    'Earth-Analytics-AY25',
    'GitRepos',
    'blm-grsg',
    'data')

# Make the directory
os.makedirs(grsg_dir, exist_ok = True)


### GBIF Data Directory
gbif_dir = os.path.join(grsg_dir, 'centrocercusUrophasianus_gbif')

# Make the directory
os.makedirs(gbif_dir, exist_ok = True)



### BLM Habitat Management Areas (HMAs) Directory
blm_dir = os.path.join(grsg_dir, 'blm_hmas')

# Make dir
os.makedirs(blm_dir, exist_ok = True)



### Sagebrush Conservation Design (SCD) Directory
scd_dir = os.path.join(grsg_dir, 'scd')

# Make dir
os.makedirs(scd_dir, exist_ok = True)

Step 1: Download GBIF Occurrence Data¶

Step 1.1: link GBIF to your Jupyter notebook¶

  • Change to reset = True the first time you run it. When you run it, you'll be prompted to enter your GBIF username, password, and associated email address
  • Then change it back to reset = False so you can re-run the chunk without having to reconnect to GBIF
  • You don’t need to modify this code chunk in any other way
In [3]:
####--------------------------####
#### DO NOT MODIFY THIS CODE! ####
####--------------------------####
# This code ASKS for your credentials 
# and saves it for the rest of the session.
# NEVER put your credentials into your code!!!!

# GBIF needs a username, password, and email 
# All 3 need to match the account
reset = False

# Request and store username
if (not ('GBIF_USER'  in os.environ)) or reset:
    os.environ['GBIF_USER'] = input('GBIF username:')

# Securely request and store password
if (not ('GBIF_PWD'  in os.environ)) or reset:
    os.environ['GBIF_PWD'] = getpass('GBIF password:')
    
# Request and store account email address
if (not ('GBIF_EMAIL'  in os.environ)) or reset:
    os.environ['GBIF_EMAIL'] = input('GBIF email:')

Step 1.2: Check species name before downloading¶

In [4]:
### grab the species info
backbone = species.name_backbone(name = 'Centrocercus urophasianus')

### check it out
print(backbone)

### pull out the species key
species_key = backbone['usageKey']

### check it out
print("Usage key:", species_key)
{'usageKey': 5959240, 'scientificName': 'Centrocercus urophasianus (Bonaparte, 1827)', 'canonicalName': 'Centrocercus urophasianus', 'rank': 'SPECIES', 'status': 'ACCEPTED', 'confidence': 99, 'matchType': 'EXACT', 'kingdom': 'Animalia', 'phylum': 'Chordata', 'order': 'Galliformes', 'family': 'Phasianidae', 'genus': 'Centrocercus', 'species': 'Centrocercus urophasianus', 'kingdomKey': 1, 'phylumKey': 44, 'classKey': 212, 'orderKey': 723, 'familyKey': 9331, 'genusKey': 5959239, 'speciesKey': 5959240, 'class': 'Aves'}
Usage key: 5959240

Step 1.3: Download occurrence data¶

In [5]:
# Only download once
### set file name for download
gbif_pattern = os.path.join(gbif_dir, '*.csv')

### double check that there isn't already a file that matches this pattern.
### if it already exists, skip the whole conditional
### and go straight to the line: gbif_path = glob(gbif_pattern)[0]
if not glob(gbif_pattern):

    ### only submit a download request to GBIF once
    ### if GBIF_DOWNLOAD_KEY is not defined in our environment, make the download request
    if not 'GBIF_DOWNLOAD_KEY' in os.environ:

        ### submit a query to GBIF
        gbif_query = occ.download([

            ### add your species key here
            f"speciesKey = {species_key}",

            ### filter out results that are missing coordinates
            "hasCoordinate = True",

            ### choose a year to include
            "year = 2015,2024",
        ])
        os.environ['GBIF_DOWNLOAD_KEY'] = gbif_query[0]

    # Wait for the download to build
    download_key = os.environ['GBIF_DOWNLOAD_KEY']

    ### use the occurrence command module in pygbif to get the metadata
    wait = occ.download_meta(download_key)['status']

    ### check if the status of the download = "SUCCEEDED"
    ### wait and loop through until it finishes
    while not wait=='SUCCEEDED':
        wait = occ.download_meta(download_key)['status']

        ### don't want to re-query the API in the loop too frequently
        time.sleep(5)

    # Download GBIF data when it's ready
    download_info = occ.download_get(
        os.environ['GBIF_DOWNLOAD_KEY'], 
        path=gbif_dir)

    # Unzip GBIF data using the zipfile package
    with zipfile.ZipFile(download_info['path']) as download_zip:
        download_zip.extractall(path=gbif_dir)

# Find the extracted .csv file path (take the first result)
gbif_path = glob(gbif_pattern)[0]
In [6]:
# GBIF data path
gbif_path = os.path.join(gbif_dir, "0023259-251120083545085.csv")

# load csv
gbif_data = pd.read_csv(
    gbif_path, 
    delimiter='\t',
    index_col='gbifID',
   usecols=['gbifID', 'decimalLatitude', 'decimalLongitude', 'month', 'year']
    )
gbif_data.head()
Out[6]:
decimalLatitude decimalLongitude month year
gbifID
5892763565 43.756638 -110.667275 4 2024
5837855429 42.538211 -110.096143 9 2024
5827911939 42.105447 -110.435588 7 2023
5759742498 45.038401 -108.962072 4 2016
5716416504 48.167900 -106.964000 9 2016

Step 1.4: Convert to geodataframe¶

In [7]:
# Convert to GeoDataFrame
gbif_gdf = (
    gpd.GeoDataFrame(
        gbif_data, 
        geometry=gpd.points_from_xy(
            gbif_data.	decimalLongitude, 
            gbif_data.decimalLatitude), 
        crs="EPSG:4326")
    # Select the desired columns
    [['year','geometry']]
)
gbif_gdf
Out[7]:
year geometry
gbifID
5892763565 2024 POINT (-110.66728 43.75664)
5837855429 2024 POINT (-110.09614 42.53821)
5827911939 2023 POINT (-110.43559 42.10545)
5759742498 2016 POINT (-108.96207 45.0384)
5716416504 2016 POINT (-106.964 48.1679)
... ... ...
1211973730 2015 POINT (-110.63114 43.64882)
1211972443 2015 POINT (-109.03738 40.48684)
1144280784 2015 POINT (-113.54315 43.28434)
1143532587 2015 POINT (-110.67129 43.75164)
1132408514 2015 POINT (-113.52872 43.5985)

13686 rows × 2 columns

Step 2: Load in BLM Habitat Management Areas (HMAs)¶

  • These were downloaded from the BLM GIS Data Portal

Step 2.1: Lead data and calculate HMA shape area (m)¶

In [8]:
# Read in HMA geopackage
blm_gdf = gpd.read_file(os.path.join(blm_dir,'BLM_WesternUS_GRSG_ROD_HabitatMgmtAreas_Feb_2025_-297793168263603713.gpkg'))

# Check data
blm_gdf.head(10)
Out[8]:
EIS_HAB Source Habitat_Type geometry
0 BighornBasin_GH_ROD ROD_HabitatMgmtAreas_Feb2020 GHMA MULTIPOLYGON (((-1025054.083 2515176.322, -102...
1 BighornBasin_PH_ROD ROD_HabitatMgmtAreas_Feb2020 PHMA MULTIPOLYGON (((-1010794.04 2465604.125, -1011...
2 Billings_GH_ROD ROD_HabitatMgmtAreas_Feb2020 GHMA MULTIPOLYGON (((-966098.796 2643394.812, -9661...
3 Billings_PH_ROD ROD_HabitatMgmtAreas_Feb2020 PHMA MULTIPOLYGON (((-1012069.205 2549740.156, -101...
4 Billings_RH_ROD ROD_HabitatMgmtAreas_Feb2020 RHMA MULTIPOLYGON (((-1015485.614 2520246.55, -1015...
5 Buffalo_GH_ROD HabitatManagementArea_Update_May2022 GHMA MULTIPOLYGON (((-762606.762 2316326.368, -7622...
6 Buffalo_PH_ROD HabitatManagementArea_Update_May2022 PHMA MULTIPOLYGON (((-800031.171 2485373.319, -7997...
7 HiLine_PH_ROD ROD_HabitatMgmtAreas_Feb2020 PHMA MULTIPOLYGON (((-902993.058 2866628.093, -9033...
8 Idaho_GH_NoAction ROD_HabitatMgmtAreas_Feb2020 GHMA MULTIPOLYGON (((-1630138.44 2587646.687, -1630...
9 Idaho_IH_NoAction ROD_HabitatMgmtAreas_Feb2020 IHMA MULTIPOLYGON (((-1351863.299 2392520.024, -135...
In [9]:
# Calculate shape area for each EIS_HAB

# Check CRS
print(blm_gdf.crs)

# Calculate area (in units of the CRS - check with blm_gdf.crs)
blm_gdf['area'] = blm_gdf.geometry.area

# Check head
blm_gdf.head(10)
PROJCS["NAD_1983_Albers",GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6269"]],PRIMEM["Greenwich",0],UNIT["Degree",0.0174532925199433]],PROJECTION["Albers_Conic_Equal_Area"],PARAMETER["latitude_of_center",23],PARAMETER["longitude_of_center",-96],PARAMETER["standard_parallel_1",29.5],PARAMETER["standard_parallel_2",45.5],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH]]
Out[9]:
EIS_HAB Source Habitat_Type geometry area
0 BighornBasin_GH_ROD ROD_HabitatMgmtAreas_Feb2020 GHMA MULTIPOLYGON (((-1025054.083 2515176.322, -102... 1.562871e+10
1 BighornBasin_PH_ROD ROD_HabitatMgmtAreas_Feb2020 PHMA MULTIPOLYGON (((-1010794.04 2465604.125, -1011... 7.212493e+09
2 Billings_GH_ROD ROD_HabitatMgmtAreas_Feb2020 GHMA MULTIPOLYGON (((-966098.796 2643394.812, -9661... 1.268364e+10
3 Billings_PH_ROD ROD_HabitatMgmtAreas_Feb2020 PHMA MULTIPOLYGON (((-1012069.205 2549740.156, -101... 3.304169e+09
4 Billings_RH_ROD ROD_HabitatMgmtAreas_Feb2020 RHMA MULTIPOLYGON (((-1015485.614 2520246.55, -1015... 4.122845e+08
5 Buffalo_GH_ROD HabitatManagementArea_Update_May2022 GHMA MULTIPOLYGON (((-762606.762 2316326.368, -7622... 2.040145e+10
6 Buffalo_PH_ROD HabitatManagementArea_Update_May2022 PHMA MULTIPOLYGON (((-800031.171 2485373.319, -7997... 4.409289e+09
7 HiLine_PH_ROD ROD_HabitatMgmtAreas_Feb2020 PHMA MULTIPOLYGON (((-902993.058 2866628.093, -9033... 9.542681e+09
8 Idaho_GH_NoAction ROD_HabitatMgmtAreas_Feb2020 GHMA MULTIPOLYGON (((-1630138.44 2587646.687, -1630... 1.759068e+10
9 Idaho_IH_NoAction ROD_HabitatMgmtAreas_Feb2020 IHMA MULTIPOLYGON (((-1351863.299 2392520.024, -135... 1.828142e+10
In [10]:
# Plot real quick
blm_gdf.plot()
Out[10]:
<Axes: >
No description has been provided for this image

Step 2.2: BLM region-specific habitat types¶

  • Control for region specific habitat type designations by assigning a priority value to each for simpler plotting

Habitat Types in our data

Acronym Full Name Region of Use Assigned Priority Tier
PHMA Priority Habitat Management Area Range-wide High
IHMA Important Habitat Management Area ID, MT High
AM Anthro Mountain Habitat Management Area UT High
RHMA Restoration Habitat Management Area MT, ND, SD High
GHMA General Habitat Management Area Range-wide Medium
LCHMA Linkage and Connectivity Habitat Management Area CO Medium
OHMA Other Habitat Management Area NV, CA Low
In [11]:
# Simplify names for better plotting
# Split EIS_HAB by underscore into new columns
blm_gdf[['Area', 'HabCode', 'Source']] = blm_gdf['EIS_HAB'].str.split('_', expand=True)

# Create a 'priorty' column that will make plotting cleaner
blm_gdf['priority'] = blm_gdf['HabCode'].map({
    'PH': 'High',
    'IH': 'High',
    'AM': 'High',
    'RH': 'High',
    'GH': 'Medium',
    'LCH': 'Medium',
    'OH': 'Low'
})

# Check
blm_gdf.head()
Out[11]:
EIS_HAB Source Habitat_Type geometry area Area HabCode priority
0 BighornBasin_GH_ROD ROD GHMA MULTIPOLYGON (((-1025054.083 2515176.322, -102... 1.562871e+10 BighornBasin GH Medium
1 BighornBasin_PH_ROD ROD PHMA MULTIPOLYGON (((-1010794.04 2465604.125, -1011... 7.212493e+09 BighornBasin PH High
2 Billings_GH_ROD ROD GHMA MULTIPOLYGON (((-966098.796 2643394.812, -9661... 1.268364e+10 Billings GH Medium
3 Billings_PH_ROD ROD PHMA MULTIPOLYGON (((-1012069.205 2549740.156, -101... 3.304169e+09 Billings PH High
4 Billings_RH_ROD ROD RHMA MULTIPOLYGON (((-1015485.614 2520246.55, -1015... 4.122845e+08 Billings RH High

Step 2.3: Plot by region name¶

In [12]:
# Plot first by region

# Area color palette
area_colors = {
    'BighornBasin': '#A5A58D',
    'Billings': '#B7B7A4',
    'Buffalo': '#CB997E',
    'HiLine': '#DDBEA9',
    'Idaho': '#E4BAA0',
    'Lander': '#B08968',
    'Lewistown': '#997B66',
    'MilesCity': '#8C6A5D',
    'NinePlan': '#7F6752',
    'NorthDakota': '#6F5E50',
    'NVCA': '#A3A380',
    'SDakota': '#C2C5AA',
    'SWMT': '#8DA47E',
    'Utah': '#7B8F7A',
    'Oregon': '#657B69',
    'NWCO': '#556B5A',
}

# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
    central_longitude=-110,
    central_latitude=40,
    standard_parallels=(30, 50)
)

# Convert using that western US albers
blm_albers = blm_gdf.to_crs(projection)
xmin, ymin, xmax, ymax = blm_albers.total_bounds

# Buffer for adjusting the zoom
buffer = 100000 # this in meters

# Create figure with same projection
fig = plt.figure(figsize=(16, 10))
ax = fig.add_subplot(1, 1, 1, projection=projection)
ax.set_extent([
    xmin - buffer, 
    xmax + buffer, 
    ymin - buffer, 
    ymax + buffer], 
    crs=projection)

# Create state boundaries feature
states_provinces = cfeature.NaturalEarthFeature(
    category='cultural',
    name='admin_1_states_provinces_lines',
    scale='50m',
    facecolor='none'
)

# Add background features
ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)

# Plot each polygon - look up color from dictionary
for idx, row in blm_albers.iterrows():
    color = area_colors.get(row['Area'], '#999999')  # Default gray if not found
    blm_albers[blm_albers.index == idx].plot(
        ax=ax,
        color=color,
        edgecolor=color,
        linewidth=1,
        zorder=2
    )

# Create legend using the color mapping
legend_handles = [Patch(facecolor=color, edgecolor='#333333', label=area) 
                  for area, color in sorted(area_colors.items())]

ax.legend(
    handles=legend_handles,
    loc='lower right',
    fontsize=8,
    title='Management Area',
    title_fontsize=10,
    ncol=2,
    framealpha=0.9
)

ax.set_title('BLM Greater Sage-Grouse\nHabitat Management Areas', 
             fontsize=18, fontweight='bold', pad=15)

ax.gridlines(draw_labels=True, linewidth=0.5, color='gray', alpha=0.5, linestyle='--')

plt.tight_layout()
plt.show()
No description has been provided for this image

Step 2.4: Plot by priority¶

In [13]:
# Plot priority levels in separate panels

# Priority color palette
prio_colors = {
    'Low': '#aa400d',
    'Medium': '#987F32',
    'High': '#2f501b'
}

# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
    central_longitude=-110,
    central_latitude=40,
    standard_parallels=(30, 50)
)

# Convert using that western US albers
blm_prio_albers = blm_gdf.to_crs(projection)
xmin, ymin, xmax, ymax = blm_prio_albers.total_bounds

# Buffer for adjusting the zoom
buffer = 100000  # this in meters

# Create figure with 3 subplots (1 row, 3 columns)
fig = plt.figure(figsize=(24, 8))

# Define the order of priority levels for panels
priority_order = ['Low', 'Medium', 'High']

# Create state boundaries feature
states_provinces = cfeature.NaturalEarthFeature(
    category='cultural',
    name='admin_1_states_provinces_lines',
    scale='50m',
    facecolor='none'
)

# Loop through each priority level and create a subplot
for i, priority in enumerate(priority_order, 1):
    # Create subplot with projection
    ax = fig.add_subplot(1, 3, i, projection=projection)
    
    # Set extent (same for all panels for consistency)
    ax.set_extent([
        xmin - buffer, 
        xmax + buffer, 
        ymin - buffer, 
        ymax + buffer], 
        crs=projection)
    
    # Add background features
    ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
    ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
    ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
    ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)
    
    # Filter data for this priority level
    priority_data = blm_prio_albers[blm_prio_albers['priority'] == priority]
    
    # Plot the priority areas
    if not priority_data.empty:
        priority_data.plot(
            ax=ax,
            color=prio_colors[priority],
            edgecolor=prio_colors[priority],
            linewidth=0.1,
            zorder=2
        )
    
    # Add title for each panel
    ax.set_title(f'{priority} Priority', 
                 fontsize=16, fontweight='bold', pad=10)
    
    # Add gridlines
    ax.gridlines(draw_labels=True, linewidth=0.5, color='gray', 
                 alpha=0.5, linestyle='--')

# Add overall title
fig.suptitle('BLM Greater Sage-Grouse Habitat Management Areas by Priority Level', 
             fontsize=20, fontweight='bold', y=0.98)

plt.tight_layout()
plt.show()
No description has been provided for this image
In [14]:
# Dissolve polygons by priority level
blm_prio_gdf = blm_gdf.dissolve(by='priority', as_index=False)

# Reduce columns
blm_prio_gdf = blm_prio_gdf[['priority', 'geometry']]

# Re-calulate area for each polygon
blm_prio_gdf['area'] = blm_prio_gdf.geometry.area

# Export as geopackage
blm_prio_gdf.to_file(os.path.join(blm_dir, 'blm_grsg_priority_dissolved.gpkg'), driver='GPKG')
INFO:Created 3 records
In [15]:
# Check 
blm_prio_gdf
Out[15]:
priority geometry area
0 High MULTIPOLYGON (((-1724455.818 1921096.47, -1724... 2.648593e+11
1 Low MULTIPOLYGON (((-2020229.501 2098929.129, -202... 2.932351e+10
2 Medium MULTIPOLYGON (((-1973743.418 2118522.438, -197... 2.840932e+11
In [16]:
# Plot second by priority

# prior color palette
prio_colors = {
    'Low': '#aa400d',
    'Medium': '#987F32',
    'High': '#2f501b'
}

# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
    central_longitude=-110,
    central_latitude=40,
    standard_parallels=(30, 50)
)

# Convert using that western US albers
blm_prio_albers = blm_prio_gdf.to_crs(projection)
xmin, ymin, xmax, ymax = blm_prio_albers.total_bounds

# Buffer for adjusting the zoom
buffer = 100000 # this in meters

# Create figure with same projection
fig = plt.figure(figsize=(16, 10))
ax = fig.add_subplot(1, 1, 1, projection=projection)
ax.set_extent([
    xmin - buffer, 
    xmax + buffer, 
    ymin - buffer, 
    ymax + buffer], 
    crs=projection)

# Create state boundaries feature
states_provinces = cfeature.NaturalEarthFeature(
    category='cultural',
    name='admin_1_states_provinces_lines',
    scale='50m',
    facecolor='none'
)

# Add background features
ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)

# Plot each polygon - look up color from dictionary
for idx, row in blm_prio_albers.iterrows():
    color = prio_colors.get(row['priority'], '#999999')  # Default gray if not found
    blm_prio_albers[blm_prio_albers.index == idx].plot(
        ax=ax,
        color=color,
        edgecolor=color,
        linewidth=0.1,
        #zorder=2
    )

# Create legend using the color mapping
legend_handles = [Patch(facecolor=color, edgecolor='#333333', label=area) 
                  for area, color in sorted(prio_colors.items())]

ax.legend(
    handles=legend_handles,
    loc='lower right',
    fontsize=8,
    title='Priority Level',
    title_fontsize=10,
    ncol=2,
    framealpha=0.9
)

ax.set_title('BLM Greater Sage-Grouse\nHabitat Priority', 
             fontsize=18, fontweight='bold', pad=15)

ax.gridlines(draw_labels=True, linewidth=0.5, color='gray', alpha=0.5, linestyle='--')

plt.tight_layout()
plt.show()
No description has been provided for this image
In [35]:
# Get area by priority from blm_prio_gdf
priority_area = blm_prio_gdf.copy()

# Sort by priority order
priority_order = ['High', 'Medium', 'Low']
priority_area['priority'] = pd.Categorical(
    priority_area['priority'], 
    categories=priority_order, 
    ordered=True
)
priority_area = priority_area.sort_values('priority')

# Convert area to thousand km²
priority_area['area_thousand_km2'] = priority_area['area'] / 1e9

# Create bar plot
fig, ax = plt.subplots(figsize=(8, 6))

# Get colors for each priority
colors = [prio_colors.get(p, '#999999') for p in priority_area['priority']]

bars = ax.bar(priority_area['priority'], priority_area['area_thousand_km2'], 
              color=colors, edgecolor=colors, linewidth=1.5)

ax.set_xlabel('Priority Level', fontsize=12, fontweight='bold')
ax.set_ylabel('Area (thousand km²)', fontsize=12, fontweight='bold')
ax.set_title('Habitat Area by Priority Level', fontsize=14, fontweight='bold')

# Add value labels on bars
for bar in bars:
    height = bar.get_height()
    ax.annotate(f'{height:.1f}',
                xy=(bar.get_x() + bar.get_width() / 2, height),
                xytext=(0, 3), textcoords="offset points",
                ha='center', va='bottom', fontsize=11, fontweight='bold')

plt.tight_layout()
plt.show()
No description has been provided for this image

Step 3: Spatially join BLM HMAs and GBIF observations¶

In [17]:
# Spatial join
gbif_hma_gdf = (
    blm_gdf
    # Match the CRS of the GBIF data and the hmas
    # Note: reproject the smaller dataset to avoid time suck
    .to_crs(gbif_gdf.crs)
    # Find ecoregion for each observation
    .sjoin(
        gbif_gdf,
        how='inner', 
        predicate='contains')
)
gbif_hma_gdf
Out[17]:
EIS_HAB Source Habitat_Type geometry area Area HabCode priority gbifID year
0 BighornBasin_GH_ROD ROD GHMA MULTIPOLYGON (((-109.05687 44.99036, -109.0568... 1.562871e+10 BighornBasin GH Medium 1382647861 2015
0 BighornBasin_GH_ROD ROD GHMA MULTIPOLYGON (((-109.05687 44.99036, -109.0568... 1.562871e+10 BighornBasin GH Medium 1383834604 2015
0 BighornBasin_GH_ROD ROD GHMA MULTIPOLYGON (((-109.05687 44.99036, -109.0568... 1.562871e+10 BighornBasin GH Medium 3168179146 2020
0 BighornBasin_GH_ROD ROD GHMA MULTIPOLYGON (((-109.05687 44.99036, -109.0568... 1.562871e+10 BighornBasin GH Medium 1727319869 2016
0 BighornBasin_GH_ROD ROD GHMA MULTIPOLYGON (((-109.05687 44.99036, -109.0568... 1.562871e+10 BighornBasin GH Medium 1378430833 2015
... ... ... ... ... ... ... ... ... ... ...
37 NWCO_LCH_2025Update 2025Update LCHMA MULTIPOLYGON (((-108.09005 40.28915, -108.0920... 1.377762e+09 NWCO LCH Medium 4091610715 2023
37 NWCO_LCH_2025Update 2025Update LCHMA MULTIPOLYGON (((-108.09005 40.28915, -108.0920... 1.377762e+09 NWCO LCH Medium 4903749174 2024
37 NWCO_LCH_2025Update 2025Update LCHMA MULTIPOLYGON (((-108.09005 40.28915, -108.0920... 1.377762e+09 NWCO LCH Medium 5063829263 2019
37 NWCO_LCH_2025Update 2025Update LCHMA MULTIPOLYGON (((-108.09005 40.28915, -108.0920... 1.377762e+09 NWCO LCH Medium 2139393221 2018
37 NWCO_LCH_2025Update 2025Update LCHMA MULTIPOLYGON (((-108.09005 40.28915, -108.0920... 1.377762e+09 NWCO LCH Medium 2137651098 2018

10554 rows × 10 columns

Step 4: Normalize observation data¶

Step 4.1: Density of observations / area of HMA¶

In [18]:
# Raw occurrence data
occurrence_df = (
    gbif_hma_gdf
    # Reset index
    .reset_index()
    # compare across HMA regions
    .groupby(['EIS_HAB', 'year'])
    # ...count the number of occurrences
    .agg(occurrences=('gbifID', 'count'),
         area=('area', 'first'))
)

# Normalize by area
occurrence_df['density'] = (
    occurrence_df.occurrences
    / occurrence_df.area
)

occurrence_df
Out[18]:
occurrences area density
EIS_HAB year
BighornBasin_GH_ROD 2015 4 1.562871e+10 2.559392e-10
2016 10 1.562871e+10 6.398480e-10
2017 4 1.562871e+10 2.559392e-10
2018 4 1.562871e+10 2.559392e-10
2019 13 1.562871e+10 8.318024e-10
... ... ... ... ...
Utah_PH_NoAction 2020 135 2.283707e+10 5.911442e-09
2021 126 2.283707e+10 5.517346e-09
2022 173 2.283707e+10 7.575403e-09
2023 190 2.283707e+10 8.319807e-09
2024 261 2.283707e+10 1.142879e-08

337 rows × 3 columns

Step 4.2: Remove rare observations¶

In [19]:
# Get rid of rare observations (possible misidentification?)
occurrence_df = occurrence_df[occurrence_df.occurrences>1]

occurrence_df
Out[19]:
occurrences area density
EIS_HAB year
BighornBasin_GH_ROD 2015 4 1.562871e+10 2.559392e-10
2016 10 1.562871e+10 6.398480e-10
2017 4 1.562871e+10 2.559392e-10
2018 4 1.562871e+10 2.559392e-10
2019 13 1.562871e+10 8.318024e-10
... ... ... ... ...
Utah_PH_NoAction 2020 135 2.283707e+10 5.911442e-09
2021 126 2.283707e+10 5.517346e-09
2022 173 2.283707e+10 7.575403e-09
2023 190 2.283707e+10 8.319807e-09
2024 261 2.283707e+10 1.142879e-08

302 rows × 3 columns

Step 4.3: Mean number of occurrences per HMA¶

In [20]:
# Take the mean by priority
mean_occurrences_by_hma = (
    occurrence_df
    .groupby('EIS_HAB')
    .mean()
)

mean_occurrences_by_hma
Out[20]:
occurrences area density
EIS_HAB
BighornBasin_GH_ROD 8.200000 1.562871e+10 5.246754e-10
BighornBasin_PH_ROD 9.555556 7.212493e+09 1.324862e-09
Billings_GH_ROD 6.000000 1.268364e+10 4.730502e-10
Billings_PH_ROD 24.700000 3.304169e+09 7.475405e-09
Buffalo_GH_ROD 10.444444 2.040145e+10 5.119461e-10
Buffalo_PH_ROD 9.666667 4.409289e+09 2.192341e-09
HiLine_GH_NoAction 4.555556 3.797708e+09 1.199554e-09
HiLine_PH_ROD 35.700000 9.542681e+09 3.741087e-09
Idaho_GH_NoAction 32.200000 1.759068e+10 1.830514e-09
Idaho_IH_NoAction 35.700000 1.828142e+10 1.952802e-09
Idaho_PH_NoAction 87.100000 2.435811e+10 3.575811e-09
Lander_GH_ROD 6.400000 4.537717e+09 1.410401e-09
Lander_PH_ROD 20.400000 8.803880e+09 2.317160e-09
Lewistown_GH_ROD 3.444444 4.107464e+09 8.385818e-10
Lewistown_PH_ROD 12.400000 4.888587e+09 2.536520e-09
MilesCity_GH_ROD 9.444444 4.835951e+10 1.952965e-10
MilesCity_PH_ROD 6.375000 1.449690e+10 4.397492e-10
MilesCity_RH_ROD 3.800000 1.385128e+09 2.743429e-09
NVCA_GH_2022Update 11.200000 3.809569e+10 2.939965e-10
NVCA_OH_2022Update 3.375000 2.932351e+10 1.150954e-10
NVCA_PH_2022Update 62.600000 5.181665e+10 1.208106e-09
NWCO_GH_2025Update 21.300000 6.922874e+09 3.076757e-09
NWCO_LCH_2025Update 2.500000 1.377762e+09 1.814537e-09
NWCO_PH_2025Update 213.200000 7.749982e+09 2.750974e-08
NinePlan_GH_ROD 89.000000 6.522446e+10 1.364519e-09
NinePlan_PH_ROD 80.100000 4.173564e+10 1.919223e-09
NorthDakota_PH_ROD 5.666667 1.866643e+09 3.035753e-09
Oregon_GH_2025Update 4.714286 2.777459e+10 1.697338e-10
Oregon_PH_2025Update 61.300000 3.212097e+10 1.908411e-09
SDakota_GH_ROD 2.000000 4.750690e+09 4.209915e-10
SDakota_PH_ROD 8.333333 3.980246e+09 2.093673e-09
SWMT_GH_NoAction 4.285714 5.038864e+09 8.505319e-10
SWMT_PH_NoAction 23.900000 5.491400e+09 4.352260e-09
Utah_AM_ROD 2.000000 1.703760e+08 1.173874e-08
Utah_GH_ROD 7.900000 6.822016e+09 1.158016e-09
Utah_PH_NoAction 147.900000 2.283707e+10 6.476313e-09

Step 4.4: Mean number of occurrences per year¶

In [21]:
# Take the mean by year
mean_occurrences_by_year = (
    occurrence_df
    .groupby('year')
    .mean()
)

mean_occurrences_by_year
Out[21]:
occurrences area density
year
2015 23.925926 1.844972e+10 2.136301e-09
2016 23.655172 1.919683e+10 1.952154e-09
2017 30.433333 1.707023e+10 2.719608e-09
2018 28.935484 1.744539e+10 2.550014e-09
2019 25.827586 1.738651e+10 2.178385e-09
2020 29.548387 1.734764e+10 2.142464e-09
2021 34.468750 1.767348e+10 3.050800e-09
2022 40.451613 1.796332e+10 3.393427e-09
2023 48.866667 1.741098e+10 4.480251e-09
2024 59.031250 1.777261e+10 4.332340e-09

Step 4.5: Normalize by areal density, mean occurrences per year, and mean occurrences per HMA (aka region)¶

In [22]:
# Normalize by space and time for sampling effort
occurrence_df['norm_occurrences'] = (
    occurrence_df[['density']]
    / mean_occurrences_by_year[['density']]
    / mean_occurrences_by_hma[['density']]
)

# Set index - some reason this is appearing as none
#occurrence_df = occurrence_df.set_index(['priority', 'year'])

occurrence_df
C:\Users\naho5798\AppData\Local\Temp\ipykernel_76000\783070017.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  occurrence_df['norm_occurrences'] = (
Out[22]:
occurrences area density norm_occurrences
EIS_HAB year
BighornBasin_GH_ROD 2015 4 1.562871e+10 2.559392e-10 2.283408e+08
2016 10 1.562871e+10 6.398480e-10 6.247007e+08
2017 4 1.562871e+10 2.559392e-10 1.793659e+08
2018 4 1.562871e+10 2.559392e-10 1.912950e+08
2019 13 1.562871e+10 8.318024e-10 7.277713e+08
... ... ... ... ... ...
Utah_PH_NoAction 2020 135 2.283707e+10 5.911442e-09 4.260416e+08
2021 126 2.283707e+10 5.517346e-09 2.792471e+08
2022 173 2.283707e+10 7.575403e-09 3.446985e+08
2023 190 2.283707e+10 8.319807e-09 2.867366e+08
2024 261 2.283707e+10 1.142879e-08 4.073332e+08

302 rows × 4 columns

Step 4.6: Plot occurrences per priority area¶

In [33]:
# Raw occurrence data
occurrence_prio_df = (
    gbif_hma_gdf
    # Reset index
    .reset_index()
    # compare across HMA regions
    .groupby(['priority', 'year'])
    # ...count the number of occurrences
    .agg(occurrences=('gbifID', 'count'),
         area=('area', 'first'))
)

# Normalize by area
occurrence_prio_df['density'] = (
    occurrence_df.occurrences
    / occurrence_df.area
)

occurrence_prio_df.head(10)
Out[33]:
occurrences area density
priority year
High 2015 494 7.212493e+09 NaN
2016 546 7.212493e+09 NaN
2017 729 7.212493e+09 NaN
2018 703 7.212493e+09 NaN
2019 592 7.212493e+09 NaN
2020 687 7.212493e+09 NaN
2021 902 7.212493e+09 NaN
2022 1071 7.212493e+09 NaN
2023 1212 7.212493e+09 NaN
2024 1448 7.212493e+09 NaN
In [34]:
# Reset index
plot_data = occurrence_prio_df.reset_index()

# Line plot
fig, ax = plt.subplots(figsize=(10, 6))

# Plot a line for each priority level
for priority, group in plot_data.groupby('priority'):
    ax.plot(group['year'], group['occurrences'], 
            marker='o', linewidth=7, markersize=10, 
            color=prio_colors.get(priority, '#999999'),
            label=priority)

ax.set_xlabel('Year', fontsize=12, fontweight='bold')
ax.set_ylabel('Occurrences', fontsize=12, fontweight='bold')
ax.set_title('Greater Sage-Grouse Occurrences by Year and Priority', 
             fontsize=14, fontweight='bold')

# Show all years
ax.set_xticks(plot_data['year'].unique())

# Add legend
ax.legend(title='Priority Level', fontsize=10, title_fontsize=11)

# Add grid for readability
ax.grid(True, alpha=0.3, linestyle='--')

plt.tight_layout()
plt.show()
No description has been provided for this image

Step 5: Map GRSG occurrence across HMAs¶

Step 5.1: Join HMAs (geodataframe) to normalized occurrence data¶

In [23]:
# Set index for joining next cell
blm_gdf = blm_gdf.set_index('EIS_HAB')

# Reset index if needed
#blm_gdf = blm_gdf.reset_index()

# See index names
print(blm_gdf.index.name)
print(occurrence_df.index.names)
EIS_HAB
['EIS_HAB', 'year']
In [24]:
# Join the occurrences with the plotting GeoDataFrame
occurrence_gdf = blm_gdf.join(occurrence_df[['norm_occurrences']])

# Check to make sure the join worked ok --> there should be no NAs
print("Any NAs:", occurrence_gdf.isna().any())

print("Index Names:", occurrence_gdf.index.names)

occurrence_gdf.head()
Any NAs: Source              False
Habitat_Type        False
geometry            False
area                False
Area                False
HabCode             False
priority            False
norm_occurrences    False
dtype: bool
Index Names: ['EIS_HAB', 'year']
Out[24]:
Source Habitat_Type geometry area Area HabCode priority norm_occurrences
EIS_HAB year
BighornBasin_GH_ROD 2015 ROD GHMA MULTIPOLYGON (((-1025054.083 2515176.322, -102... 1.562871e+10 BighornBasin GH Medium 2.283408e+08
2016 ROD GHMA MULTIPOLYGON (((-1025054.083 2515176.322, -102... 1.562871e+10 BighornBasin GH Medium 6.247007e+08
2017 ROD GHMA MULTIPOLYGON (((-1025054.083 2515176.322, -102... 1.562871e+10 BighornBasin GH Medium 1.793659e+08
2018 ROD GHMA MULTIPOLYGON (((-1025054.083 2515176.322, -102... 1.562871e+10 BighornBasin GH Medium 1.912950e+08
2019 ROD GHMA MULTIPOLYGON (((-1025054.083 2515176.322, -102... 1.562871e+10 BighornBasin GH Medium 7.277713e+08

Step 5.2: Plot normalized occurrences by HMA and year¶

In [25]:
# Custom color plover color palette
custom_palette = [
    "#bfc09f", 
    "#aeb38e",
    "#9da67d",
    "#8c996d",
    "#7b8c5e",
    "#6a8050",
    "#597441",
    "#4a6834",
    "#3c5c27",
    "#2f501b",
    "#234510",
    "#183a06"
]

# Create a custom colormap from the palette
cmap = mcolors.ListedColormap(custom_palette)

# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
    central_longitude=-110,
    central_latitude=40,
    standard_parallels=(30, 50)
)

# Reset index to make 'year' accessible as a column
occurrence_with_year = occurrence_gdf.reset_index()

# Convert using that western US albers
occurrence_albers = occurrence_with_year.to_crs(projection)
xmin, ymin, xmax, ymax = occurrence_albers.total_bounds

# Add buffer for padding
buffer = 100000  # meters

# Create figure with 3x3 subplots
fig = plt.figure(figsize=(20, 16))

# Create state boundaries feature (define once outside loop)
states_provinces = cfeature.NaturalEarthFeature(
    category='cultural',
    name='admin_1_states_provinces_lines',
    scale='50m',
    facecolor='none'
)

# Create a subplot for each year
# Note: excluding 2015 just to make the 3x3 plot look better
for i, year_num in enumerate(range(2016, 2025)):
    # Create subplot with Albers projection
    ax = fig.add_subplot(3, 3, i + 1, projection=projection)
    
    # Filter data for current year
    year_data = occurrence_albers[occurrence_albers['year'] == year_num]
    
    # Set extent to keep zoom consistent
    ax.set_extent([
    xmin - buffer, 
    xmax + buffer, 
    ymin - buffer, 
    ymax + buffer], 
    crs=projection)
    
    # Add background features
    ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
    ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
    ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
    ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)
    
    # Plot the occurrence data
    if len(year_data) > 0:
        year_data.plot(
            ax=ax,
            column='norm_occurrences',
            cmap=cmap,
            legend=False,
            zorder=2
        )
    
    # Add year as title for each panel
    ax.set_title(year_num, fontsize=14, fontweight='bold')
    
    # Add gridlines
    ax.gridlines(draw_labels=False, linewidth=0.5, color='gray', alpha=0.5, linestyle='--')

# Add overall title
fig.suptitle('Greater Sage Grouse Occurrence, 2015-2024', 
             fontsize=20, fontweight='bold', y=0.995)

# Add a single colorbar for all subplots
fig.subplots_adjust(right=0.92, hspace=0.25, wspace=0.15)
cbar_ax = fig.add_axes([0.94, 0.15, 0.015, 0.7])
sm = plt.cm.ScalarMappable(
    cmap=cmap,
    norm=mcolors.Normalize(
        vmin=occurrence_with_year['norm_occurrences'].min(),
        vmax=occurrence_with_year['norm_occurrences'].max()
    )
)
sm.set_array([])
cbar = fig.colorbar(sm, cax=cbar_ax)
cbar.set_label('Normalized occurrence', fontsize=12, fontweight='bold')

# Adjust layout
plt.tight_layout(rect=[0, 0, 0.92, 0.99])

# Show the plot
plt.show()
C:\Users\naho5798\AppData\Local\Temp\ipykernel_76000\4241496113.py:106: UserWarning: This figure includes Axes that are not compatible with tight_layout, so results might be incorrect.
  plt.tight_layout(rect=[0, 0, 0.92, 0.99])
No description has been provided for this image

Step 5.2 (part 2): check if these data look normal - there seems to be a strange pattern in the normalized figure¶

In [26]:
# Just curious if these data look normal - trying to figure out this pattern
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Histogram
axes[0].hist(occurrence_gdf['norm_occurrences'].dropna(), bins=50, edgecolor='black')
axes[0].set_xlabel('Normalized Occurrences')
axes[0].set_ylabel('Frequency')
axes[0].set_title('Distribution of Normalized Occurrences')
axes[0].axvline(occurrence_gdf['norm_occurrences'].mean(), 
                color='red', linestyle='--', label='Mean')
axes[0].axvline(occurrence_gdf['norm_occurrences'].median(), 
                color='orange', linestyle='--', label='Median')
axes[0].legend()

# Box plot by year
occurrence_gdf.reset_index().boxplot(column='norm_occurrences', by='year', ax=axes[1])
axes[1].set_xlabel('Year')
axes[1].set_ylabel('Normalized Occurrences')
axes[1].set_title('Distribution by Year')
plt.suptitle('')

plt.tight_layout()
plt.show()
No description has been provided for this image

Step 5.3: Compare with raw occurrence data (GBIF)¶

In [27]:
# Plot density of GBIF observations per year using hexbins

# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
    central_longitude=-110,
    central_latitude=40,
    standard_parallels=(30, 50)
)

# Convert GBIF data to Albers projection
gbif_albers = gbif_gdf.to_crs(projection)

# Get bounds from BLM data for consistent extent
blm_albers = blm_gdf.to_crs(projection)
xmin, ymin, xmax, ymax = blm_albers.total_bounds

# Add buffer for padding
buffer = 100000  # meters

# Create figure with 3x3 subplots (for years 2016-2024)
fig = plt.figure(figsize=(20, 16))

# Create state boundaries feature
states_provinces = cfeature.NaturalEarthFeature(
    category='cultural',
    name='admin_1_states_provinces_lines',
    scale='50m',
    facecolor='none'
)

# Create a subplot for each year
for i, year_num in enumerate(range(2016, 2025)):
    # Create subplot with Albers projection
    ax = fig.add_subplot(3, 3, i + 1, projection=projection)
    
    # Filter data for current year
    year_data = gbif_albers[gbif_albers['year'] == year_num]
    
    # Set extent to keep zoom consistent
    ax.set_extent([
        xmin - buffer, 
        xmax + buffer, 
        ymin - buffer, 
        ymax + buffer], 
        crs=projection)
    
    # Add background features
    ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
    ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
    ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
    ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)
    
    # Plot BLM habitat areas in light gray as context
    blm_albers.plot(ax=ax, facecolor='none', edgecolor='gray', 
                     linewidth=0.5, alpha=0.3, zorder=1)
    
    # Plot observation density using hexbin
    if len(year_data) > 0:
        x = year_data.geometry.x
        y = year_data.geometry.y
        
        hexbin = ax.hexbin(
            x, y,
            gridsize=30, 
            cmap='YlOrRd',
            mincnt=1,
            alpha=0.7,
            zorder=3,
            edgecolors='none'
        )
    
    # Add year and count
    ax.set_title(f'{year_num} (n={len(year_data)})', 
                 fontsize=14, fontweight='bold')
    
    # Add gridlines
    ax.gridlines(draw_labels=False, linewidth=0.5, 
                 color='gray', alpha=0.5, linestyle='--')

# Add title
fig.suptitle('Greater Sage-Grouse GBIF Observation Density by Year', 
             fontsize=20, fontweight='bold', y=0.995)

# Add a colorbar
fig.subplots_adjust(right=0.92, hspace=0.25, wspace=0.15)
cbar_ax = fig.add_axes([0.94, 0.15, 0.015, 0.7])
cbar = fig.colorbar(hexbin, cax=cbar_ax)
cbar.set_label('Observations per hexagon', fontsize=12, fontweight='bold')

# Adjust layout
plt.tight_layout(rect=[0, 0, 0.92, 0.99])

# Show the plot
plt.show()
C:\Users\naho5798\AppData\Local\Temp\ipykernel_76000\1277146570.py:91: UserWarning: This figure includes Axes that are not compatible with tight_layout, so results might be incorrect.
  plt.tight_layout(rect=[0, 0, 0.92, 0.99])
No description has been provided for this image

Step 5.4: Log transform raw occurrence data to smooth hotspots¶

In [28]:
# Plot density of GBIF observations per year using hexbins

# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
    central_longitude=-110,
    central_latitude=40,
    standard_parallels=(30, 50)
)

# Convert GBIF data to Albers projection
gbif_albers = gbif_gdf.to_crs(projection)

# Get bounds from BLM data for consistent extent
blm_albers = blm_gdf.to_crs(projection)
xmin, ymin, xmax, ymax = blm_albers.total_bounds

# Add buffer for padding
buffer = 100000  # meters

# Create figure with 3x3 subplots (for years 2016-2024)
fig = plt.figure(figsize=(20, 16))

# Create state boundaries feature
states_provinces = cfeature.NaturalEarthFeature(
    category='cultural',
    name='admin_1_states_provinces_lines',
    scale='50m',
    facecolor='none'
)

# Create a subplot for each year
for i, year_num in enumerate(range(2016, 2025)):
    # Create subplot with Albers projection
    ax = fig.add_subplot(3, 3, i + 1, projection=projection)
    
    # Filter data for current year
    year_data = gbif_albers[gbif_albers['year'] == year_num]
    
    # Set extent to keep zoom consistent
    ax.set_extent([
        xmin - buffer, 
        xmax + buffer, 
        ymin - buffer, 
        ymax + buffer], 
        crs=projection)
    
    # Add background features
    ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
    ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
    ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
    ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)
    
    # Plot BLM habitat areas in light gray as context
    blm_albers.plot(ax=ax, facecolor='none', edgecolor='gray', 
                     linewidth=0.5, alpha=0.3, zorder=1)
    
    # Plot observation density using hexbin
    if len(year_data) > 0:
        x = year_data.geometry.x
        y = year_data.geometry.y
        
        hexbin = ax.hexbin(
            x, y,
            gridsize=30, 
            cmap='YlOrRd',
            mincnt=1,
            alpha=0.7,
            zorder=3,
            edgecolors='none',
            norm=mcolors.LogNorm()
        )
    
    # Add year and count
    ax.set_title(f'{year_num} (n={len(year_data)})', 
                 fontsize=14, fontweight='bold')
    
    # Add gridlines
    ax.gridlines(draw_labels=False, linewidth=0.5, 
                 color='gray', alpha=0.5, linestyle='--')

# Add title
fig.suptitle('Greater Sage-Grouse GBIF Observation Density by Year', 
             fontsize=20, fontweight='bold', y=0.995)

# Add a colorbar
fig.subplots_adjust(right=0.92, hspace=0.25, wspace=0.15)
cbar_ax = fig.add_axes([0.94, 0.15, 0.015, 0.7])
cbar = fig.colorbar(hexbin, cax=cbar_ax)
cbar.set_label('Observations per hexagon', fontsize=12, fontweight='bold')

# Adjust layout
plt.tight_layout(rect=[0, 0, 0.92, 0.99])

# Show the plot
plt.show()
C:\Users\naho5798\AppData\Local\Temp\ipykernel_76000\2667436826.py:92: UserWarning: This figure includes Axes that are not compatible with tight_layout, so results might be incorrect.
  plt.tight_layout(rect=[0, 0, 0.92, 0.99])
No description has been provided for this image